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Abstract 

We consider the following problem: given an unsorted array of n el- 
ements, and a sequence of intervals in the array, compute the median in 
each of the subarrays defined by the intervals. We describe a simple algo- 
rithm which uses 0(n) space and needs 0(n log k + k log n) time to answer 
the first k queries. This improves previous algorithms by a logarithmic 
factor and matches a lower bound for k = 0(n). Since the algorithm 
decomposes the range of element values rather than the array, it has nat- 
ural generalizations to higher dimensional problems - it reduces a range 
median query to a logarithmic number of range counting queries. 

1 Introduction and Related Work 

The classical problem of finding the median is to find the element of rank \n/2~\ 
in an unsorted array of n elements^ Clearly, the median can be found in 
O(nlogn) time by sorting the elements. However, a classical algorithm finds 
the median in 0{n) time |BFP + 72 , which is asymptotically optimal. 



More recently, the following generalization, called the Range Median Problem 
(RMP), has been considered [KMS051 IHPM08] : 

Input: An unsorted array A with n elements, each having a value. Furthermore, 
a sequence of k queries Qi, ■ ■ ■ , Qk, each defined as an interval Qi = [li, r,]. In 
general, the sequence is given in an online fashion, we want an answer after 
every query, and k is not known in advance. 

Output: A sequence x±, . . . ,Xk of values, where Xi is the median of the elements 
in A [i< , 7"i] - Here, ^4[Z,r] denotes the set of all elements whose index in A is at 
least I and at most r. 

This RMP naturally fits into a larger group of problems, in which an unsorted 
array is given, and in a query one wants to compute a certain function of all the 
elements in a given interval. Instead of the median, natural candidates for such 
a function are: 

• Sum: this problem can be trivially solved in 0(n) preprocessing time and 
O(l) query time by computing prefix sums. 



*Partially supported by DFG grant SA 933/3-1. 

1 An element has rank i if it is the i-th element in some sorted order. Actually, any 
specified rank might be of interest. We restrict ourselves to the median to simplify notation 
but a generalization to arbitrary ranks will be straightforward for all our results. 
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• Semigroup operator: this problem is significantly more difficult than the 
sum. However, there exists a very efficient solution: for any constant 
c, a preprocessing in 0(nc) time and space allows to answer queries in 
0(a c (n)) time, where ctk is a certain functional inverse of the Ackerman 
function |Yao82j . A matching lower bound was given in |Yao85j . 

• Maximum, Minimum: This is a special case of a semigroup operator, for 
which the problem can be solved slightly more efficiently: 0(n) prepro- 
cessing time and space is sufficient to allow 0(1) time queries (see e.g. 
|BFC04p . 

In addition to being a natural extension of the median problem, the RMP has 
some applications in practice, namely obtaining a "typical" element in a given 
time scries out of a given time interval [HPM08J. 

Natural special cases of the RMP are an offline variant, where all queries 
are given in a batch and a variant where we want to do all preprocessing up 
front and are then interested in good worst case bounds for answering a single 
query. 

The authors of [HPM08J give a solution of the RMP which requires 0(n log fc+ 
fclognlogfc) time and O(nlogfc) space. In addition, they give a lower bound 
of f2(nlogfc) time for comparison-based algorithms. They basically use a one- 
dimensional range tree over the input array, where each inner node corresponds 
to a subarray defined by an interval. Each such subarray is sorted, and stored 
with the node. A range median query then corresponds to selecting the median 
from O(logfc) sorted subarrays (whose union is the queried subarray) of total 
length 0(n), which requires O(lognlogfc) time. The main difficulty of their 
approach is to show that the subarrays need not be fully sorted, but only pre- 
sorted in a particular way, which reduces the construction time of the tree from 
O(nlogn) to 0(n log k). 

Concerning the preprocessing variant of the RMP, [KMS05J give a data 
structure and answers queries in 0(log n) time, which uses 0(n log 2 n/ log log n) 
space They do not analyze the required preprocessing time, but it is clearly at 
least as large as the required space in words. Moreover, they give a structure 
which uses only 0(n) space, but query time 0(n e ) for arbitrary e > 0. An- 
other data structure given in |Pet08| can answer queries in 0(1) time and uses 
0(n 2 log^ n/ logn) space, where p is an arbitrary integer, and log^' n is the p 
times iterated logarithm of 

Our results. 

First, in Section [2] we give an algorithm for the pointer-machine model which 
solves the RMP in 0{n log fc + fclogn) time and O(nlogfc) space. This im- 
proves the running time of 0(nlogfc + fclogn log fc) reported in [HPM08J for 
fc G a;(n/logn). Our algorithm is also considerably simpler. The idea is to 
reduce a range median query to a logarithmic number of related range counting 
queries by recursively partitioning the values in array A in a tree in a fashion 
similar to Quicksort. The final time bound is achieved using the technique of 
Fractional Cascading. In Section [2.1| we explain why our algorithm is optimal 
for fc £ 0(n) and at most f2(logn) from optimal for fc s w(n). 

2 Note that the data structures in KMS05 PctOS work only for a specific quantile (e.g. 
the median), which must be the same for all queries. 
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Figure 1: An example for the array A = [3, 7, 5.5, 4, 9, 6.2, 9, 4, 2, 5]. The median 
value used to split the set of points is 5. For the query L — 3, R — 8, there are 
two elements inside A.low[L, R] and four elements in A.high[L, R\. Hence, the 
median of A[L, R] is the element of rankp = |~(2 + 4)/2] — 2 = 1 in A.high[L, R]. 



Section [3] achieves linear space in the RAM model using techniques from 
succinct data structures - the range counting problems are reduced to rank 
computations in bit arrays. To achieve the desired bound, we compress the 
recursive subproblems in such a way that the bit arrays remain dense at all 
times. 

The latter algorithm can be easily modified to obtain a data structure using 
0(n) space, can be constructed in 0{n log n) time, and allows to answer arbi- 
trary range median queries (or an arbitrary rank, which may be specified online 
together with the query interval) in O(logn) time. Note that the previously 
best linear-space data structure required 0{n e ) query time [KMS05J. 

After a few remarks on generalizations for higher dimensional inputs in Sec- 
tion HI we discuss dynamic variants of the RMP problem in Section [51 before 
Section E] concludes with a summary and some open problems. 

2 A Pointer Machine Algorithm 

Our algorithm is based is the following key observation (see also Figure []}: 
Suppose that we partition the elements in array A of length n into two smaller 
arrays: A.low which contains all elements with the n /2 smallest values in 
A, and A. high which contains all elements with the n/2 largest values. The 
elements in A.low and A.high are sorted by their index in A, and each element 
e in A.low and A.high is associated with its index e.i in the original input array 
and its value e.v. Now, if we want to find the element of rank p in the subarray 
A[L, R], we can do the following: We count the number m of elements in A.low 

3 To simplify notation we ignore some trivial rounding issues and also sometimes assume 
that all elements have unique values. This is without loss of generality because we could 
artificially expand the size A to the next power of two and because we can use the position of 
an element in A to break ties in element comparisons. 
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which are contained in A[L,R\. To obtain m, we can do a binary search for 
both L and R in A.low (using the e.i fields). If p < m, then the element of rank 
p in A[L,R) is the element of rank p in A.low[L, R]. Otherwise, the element of 
rank p is the element of rank p-min A.high[L, R]. 

Hence, using the partition of A into A.low and A. high, we can reduce the 
problem of finding an element of a given rank in array A[L. R] to the same prob- 
lem, but on a smaller array (either A.low[L, R] or A.high[L, R]). Our algorithm 
applies this reduction recursively. 

Algorithm overview. The basic idea is therefore to subdivide the n elements 
in the array into two parts of (almost) equal size by computing the median of 
their values and using it to split the list into a list of the n/2 elements with 
smaller values and a list of the n/2 elements with larger values. The two parts 
are recursively subdivided further, but only when required by a query. To answer 
a range median query, we determine in which of the two parts the element of the 
desired rank lies (initially, this rank corresponds to the median, but this may 
change during the search) . Once this is known, the search continues recursively 
in the appropriate part until a trivial problem of constant size is encountered. 

We will show that the total work involved in splitting the subarrays is 
O(nlogfc) and that the search required for any query can be completed in 
O(logn) time using Fractional Cascading CG86 . Hence, the total running 
time is O (n log k + k log n) . 



Algorithm 1: Query (A, L, R,p) 



1 Input: range select data structure A of elements, query range [L,R], 
desired rank p 

2 if \A\ = 1 then return ^4[1] 

3 if A.low is undefined then 

4 
5 
6 
7 



Compute median x value of the pairs in A 
A.low := (e G A : e.v < x) 
A.high := (e S A : e.v > x) 

{ (e G A : Q) is an array containing all elements e of A satisfying the 
given condition Q, ordered as in A } 

8 { Find(A, q) returns max{j : A[j].i < q} (with Find(A,0) = 0) } 

9 / := Find(AJow, L — 1) ; // # of low elements left of L 
10 r := Fmd(A.low 7 R) ; // # of low elements up to R 
n m := r — I ; // # of low elements between L and R 

12 if p < rn then return Query(Alow, L, R, p) 

13 else return Qucry(Ahigh, L, R, p ~ m) 



Detailed description and analysis. Algorithm [T] gives pseudo-code for the 
query, which performs preprocessing (i.e., splitting the array into two smaller 
arrays) only where needed. Note that we have to keep three things separate here: 
values that are relevant for median computation and partitioning the input, 
positions in the input sequence that are relevant for finding the elements within 
the range [L, R], and positions in the subdivided arrays that are important for 
counting elements. 
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Let us first analyze the time required for processing a query not counting 
the 'preprocessing' time within lines 4-6: The query descends log 2 n levels of 
recursion. On each level, .Find-operations for L and R are performed on the 
lower half of the current subproblcm. If we used binary search, we would get a 
total execution time of up to ^ n O(logS) = 9(log 2 n). However, the fact 
that in all these searches, we search for the same key (L or R) allows us to use a 
standard technique called Fractional Cascading CG86I that reduces the search 
time to a constant, once the position in the first search is known. Indeed, we 
only need a rather basic variant of Fractional Cascading, which applies when 
each successor list is a sublist of the previous one dBvKOSOQj. Here, it suffices 
to augment an element e of a list with a pointer to the position of some element 
e' in each subsequent list (we have two successors - A.low and A. high). In our 
case, we need to point to the largest element in the successor that is no larger 
than e. We get a total search time of O(logn). 

Now we turn to the preprocessing code in lines 4-6 of Algorithm [TJ Let 
s(i) denote the level of recursion at which query i encountered an undefined 
array A.low for the first time. Then the preprocessing time performed during 
query i is 0(n/2 s W) if a linear time algorithm is used for median selection 
[BFP+72] (note that we have a linear recursion with geometrically decreasing 
execution times). This preprocessing time also includes the cost of finding the 
pointers for Fractional Cascading while splitting the list in lines 4-6. Since the 
preprocessing time during query i decreases with s{i), the total preprocessing 
time is maximized if small levels s(i) appear as often as possible. However, 
level j can appear no more than 2 J times in the sequence s(l), s(2), . . . , s(fc)o 
Hence, we get an upper bound for the preprocessing time when the smallest 
[logfcj levels are used as often as possible ('filled') and the remaining levels are 
[logfc]. The preprocessing time at every used level is 0(n) giving a total time 
of 0(n log fc). The same bound applies to the space consumption since we never 
allocate memory that is not used later. We summarize the main result of this 
section in a theorem: 

Theorem 1. The online range median problem (RMP) on an array with n 
elements and k range queries can be solved in time 0(n\ogk + fclogn) and 
space 0(n log k). 

Another variant of the above algorithm invests 0(n log n) time and space 
into complete preprocessing up front. Subsequently, any range median query 
can be answered in O(logn) time. This improves the preprocessing space of the 
corresponding result in [KMS05] by a factor log n / log log n and the preprocess- 
ing time by at least this factor. 

2.1 Lower Bounds 

We shortly discuss how far our algorithm is from optimality. In |HPM08| . 
a comparison-based lower bound of fi(n logfc) is shown for the range median 
problem 0. As our algorithm shows, this bound is (asymptotically) tight if 

4 Indccd, for j > the maximal number is 2 J_1 since the other half of the available 
subintervals have already been covered by the preprocessing happening in the layer above. 

5 The authors derive a lower bound of I := k} ^ n J^ iy.) k ' w ^ ere n ls a mu ltiple of fc < n. 
Unfortunately, the analysis of the asymptotics of I given in HPM08 is erroneous; however, a 
corrected analysis shows that the claimed Q(n logfc) bound holds. 
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k G 0(n). For larger k, the above lower bound is no longer valid, as the 
construction requires k < n. Yet, a lower bound of Q(n\ogn) is immediate 
for k > ?i, considering only the first n — 1 queries. Furthermore, fl(k) is a 
trivial lower bound. Note that in the analysis of our algorithm, the number of 
levels of the recursion is actually bounded by 0(min{log k, logn}), and thus for 
any k > n our algorithm has running time 0(n\ogn + fclogn), which is up to 
f2(logn) from optimal for any k. 

In a very restricted model (sometimes called "Pointer Machine"), where a 
memory location can be reached only by following pointers, and not by direct 
addressing, our algorithm is indeed optimal also for k > n: it takes f2(logn) 
time to even access an arbitrary element of the input. Since every element 
of the input is the answer to at least one range query (e.g. the query whose 
range contains only this element), the bound follows. An interesting question is 
whether a lower bound f2(fclogn) could be shown in stronger models. However, 
note that any comparison-based lower bound (as the one in HPM08]) cannot be 
higher than Vt(n\ogn): With 0(n\ogn) comparisons, an algorithm can deter- 
mine the permutation of the array elements, which suffices to answer any query 
without further element comparisons. Therefore, one would need to consider 
stronger models (e.g. the "cell-probe" model), in which proving lower bounds 
is significantly more difficult. 



3 A Linear Space RAM Implementation 



Algorithm 2: Query(A, L, R,p) 



1 Input: range select data structure A, query range [L,R] and desired 
rank p 

2 if |A| = 1 then return A[l] 

3 if A.low is undefined then 

4 
5 
6 
7 
8 



Compute median x value of the values in A 
A.lowbits := Bit Vector (| A , {i £ l..\A\ : A[i] < x}) 
A.low := {A[i] : i € l..|A|, A[i] < x) 
A.high := (A[i\ : i £ l..|A|,A[i] > x) 
deallocate the value array of A itself 
9 I := A.lowbits.rank(L — 1) 
10 r := A.lowbits.rank(i?) 
n m := r — I 

12 if p < rn then return Query (A. low, I + 1, r, p) 

13 else return Query(A.high, L — /, R — r, p — m) 



Our starting point for a more space efficient implementation of Algorithm [T] 
is the observation that we do not actually need all the information available 
in the arrays stored at the interior nodes of our data structure. All we need 
is support for the operation Find(x) that counts the number of elements e 
in A.low that have index e.i < x. This information can already be obtained 
from a bit-vector where a 1-bit indicates whether an element of the original 
array is in A.low. For this bit-vector, the operation corresponding to Find is 
called rank. In the RAM model, there are data structures that need space 
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n + o(n) bits, can be constructed in linear time and support rank in constant 
time (e.g., |Cia88l IU506fh . Unfortunately, this idea alone is not enough since 
we would need to store 2 J bit arrays consisting of n position each on level j. 
Summed over all levels, this would still need f2(nlog 2 n) bits of space even if 
optimally compressed data structures were used. This problem is solved using 
an additional idea: for a node of our data structure with value array A, we do not 
store a bit array with n possible positions but only with \A\ possible positions, 
i.e., bits represent positions in A rather than in the original input array. This 
way, we have n positions on every level leading to a total space consumption 
of 0(n log n) bits. For this idea to work, we need to be able to transform the 
query range in the recursive call in such a way that rank operations in the 
contracted bit arrays are meaningful. Fortunately, this is easy because the rank 
information we compute also defines the query range in the contracted arrays. 
Algorithm [2] gives pseudocode specifying the details. Note that the algorithm is 
largely analogous to Algorithm^ In some sense, the algorithm becomes simpler 
because the distinction between query positions and array positions for counting 
disappears (If we still want to report the positions of the median values in the 
input, we can store this information at the leaves of the data structure using 
linear space) . Using an analysis analogous to the analysis of Algorithm [T] we 
obtain the following theorem: 

Theorem 2. The online range median problem (RMP) on an array with n 
elements and k range queries can be solved in time 0(n\ogk + fclogn) and 
space 0(n) words on the RAM model. 

By doing all the preprocessing up front, we obtain an algorithm with prepro- 
cessing time 0(n log n) using 0(n) space and query time 0(log n) . This improves 
the space consumption compared to KMS05 by a factor log 2 nj log log n. 



4 Higher Dimensions 

Since our algorithm decomposes the values rather than the positions of elements, 
it can be naturally generalized to higher dimensional point sets. Wc obtain an 
algorithm that needs 0{n log k) preprocessing time plus the time for supporting 
range counting queries on each level. The amortized query time is the time for 
O(logn) range counting queries. Note that query ranges can be specified in 
any way we wish: (hyper)-rectangles, circles, etc., without affecting the way we 
handle values. For example, using the data structure for 2D range counting from 
JMS04] we obtain a data structure for the 2D rectangular range median problem 
that needs O(nlognlogfc) preprocessing time, 0(n log k/ log log n) space, and 
(3(log 2 nj log log n) query time. This not only applies to 2D arrays consisting of 
n input points put to arbitrary two-dimensional point sets with n elements. 

Unfortunately, further improvements, e.g. to logarithmic query time seem 
difficult. Although the query range is the same at all levels of recursion, Frac- 
tional Cascading becomes less effective when the result of a rectangular range 

6 Indeed, since we only need the rank operation, there are very simple and efficient imple- 
mentations: store a table with ranks for indices that are a multiple of w = 0(logn). General 
ranks are then the sum of the next smaller table entry and the number of 1-bits in the bit ar- 
ray between this rounded position and the query position. Some processors have a POPCNT 
instruction for this purpose. Otherwise we can use lookup tables. 
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counting query is denned by more than a constant number of positions within 
the data structure because we would have to follow many forwarding pointers. 
Also, the array contraction trick that allowed us to use dense bit arrays in Sec- 
tion [3] does not work anymore because an array with half the number of bits 
need not contain any empty rows or columns. 

Another indication that logarithmic query time in two dimensions might be 
difficult to achieve is that there has been intensive work on the more specialized 
median-filtering problem in image processing where we ask for all range medians 
with query ranges that are squares of size 2r + 1 x 2r+ 1 in an image with n pixels. 
The best previous algorithms known here need time 0(nlog 2 r) [GW93J unless 
the range of values is very small |PH07( ICWE07] . Our result above improves 
this by a factor log log r (by applying the general algorithm to input pieces of 
size 3r x 3r) but this seems to be of theoretical interest only. 

5 Dynamic Range Medians 

In this section, we consider a dynamic variant of the RMP, where we have a 
linked list instead of an array, and elements can be deleted or inserted arbitrarily. 
In this setting, we still want to answer median queries, whose range is given by 
two pointers to the first and the last element in the query range. 

In the following, we sketch a solution which allows inserts and deletes in 
0(log 2 n) amortized time each, and range median queries in 0(log 2 n) worst 
case time. The basic idea is to use a BB(a) tree |NR72) as a primary structure, 
in which all elements are ordered by their value. With each inner node, we 
associate a secondary structure, which contains all the elements of the node's 
subtree, ordered by their position in the input list. More precisely, we store 
these elements in a balanced binary search tree, where nodes are augmented 
with a field indicating the size of their subtree, see e.g. [RouOlj . This data 
structure permits to answer a range query by a simple adaptation of Algorithm!!] 
starting at the root, we determine the number of elements within the query range 
which are in the left subtree, and depending on the result continue the search 
for the median in the left or in the right subtree. The required counting in 
each secondary structure takes O(logn) time, and we need to perform at most 
O(logn) such searches for any query. When an element is inserted or deleted, 
we follow the search path in the BB(a) tree according to its value, and update 
all the O(logn) secondary structures of the visited nodes. The main difficulty 
arises when a rotation in the BB(a) tree is required: in this case, the secondary 
structures arc rebuilt from scratch, which costs O(plogp) time if the subtree 
which is rotated contains p nodes. However, as shown in |Meh84[ |WL85| . such 
rotations are required so rarely that the amortized time of such an event is only 
O(logplogn) = 0(log 2 n). 

We note that using this dynamic data structure for the one-dimensional 
RMP, we can implement a two-dimensional median filter, by scanning over the 
image, maintaining all the pixels in a strip of width r. In this way, we obtain a 
running time of 0(log 2 r) per pixel, which matches the state-of-the-art solution 
for this problem {GW93 . This indicates that obtaining a solution with O(logn) 
time for all operations could be difficult. 
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6 Conclusion 



We have presented improved upper bounds for the range median problem. 

The query time of our solution is asymptotically optimal for k e 0(n), or 
when all preprocessing has to be done up front. For larger values of k, our 
solution is at most a factor log n from optimal. In a very restricted model where 
no arrays are allowed, our solution is optimal for all k. Moreover, in the RAM 
model, our data structure requires only 0{n) space, which is clearly optimal. 
Making the data structure dynamic adds a factor logn to the query time. Using 
0(n 2 ) space, it is trivial to precompute all medians of a given array so that the 
query time becomes constant. However, it is open whether the term O(klogn) 
in the query time could be reduced towards 0(k) in the RAM model when 
k = o{n 2 ). 

Given the simplicity of our data structure, a practical implementation would 
be easily possible. To avoid the large constants involved when computing me- 
dians for recursively splitting the array, one could use a pivot chosen uniformly 
at random. This should work well in expectation. 

It would be interesting to find faster solutions for the dynamic RMP or the 
two-dimensional (static) RMP: Either would lead to a faster median filter for 
images, which is a basic tool in image processing. 
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